# Edge device deployment

PP OCRv3 Mobile Rec

PP-OCRv3_mobile_rec is a lightweight text line recognition model developed by the PaddleOCR team. It uses the SVTR algorithm and supports Chinese and English recognition, with a particular focus on Chinese scenarios.

Text Recognition Supports Multiple Languages

The Holo1-7B GGUF model is part of the Surfer-H system and is suitable for multimodal tasks such as visual document retrieval. It is particularly good at web page interaction and network monitoring, and can achieve high accuracy at a low cost.

Transformers English

Llama 3.1 Nemotron Nano VL 8B V1

Llama-3.1-Nemotron-Nano-VL-8B-V1 is a vision-language model for document intelligence. It can query and summarize images and videos, and it supports deployment in multiple environments.

Llama 3.1 Nemotron Nano 4B V1.1 GGUF

Llama-3.1-Nemotron-Nano-4B-v1.1 is a large language model derived from Llama 3.1 that balances accuracy and efficiency. It is suitable for scenarios such as AI agents and chatbots.

Large Language Model

Transformers English

Acemath RL Nemotron 7B GGUF

AceMath-RL-Nemotron-7B is a mathematical reasoning model trained entirely through reinforcement learning, starting from DeepSeek-R1-Distill-Qwen-7B. It performs strongly on mathematical reasoning tasks and also shows some generalization to coding tasks.

Large Language Model

Transformers English

Dfine Large Obj365

D-FINE is a powerful real-time object detector that achieves exceptional localization accuracy by redefining the bounding box regression task in DETR models.

Object Detection

Transformers English

Dfine Medium Obj2coco

D-FINE is a real-time object detection model that achieves exceptional localization accuracy by redefining the bounding box regression task.

Object Detection

Transformers English

Qwen2.5 VL 3B Instruct GGUF

Qwen2.5-VL-3B-Instruct is a 3B-parameter multimodal model that generates text from image and text input. This GGUF build is optimized for the vision capabilities in llama.cpp.

Text-to-Image English

Gemma 3 27b It GGUF

A GGUF quantized version of Gemma 3 with 27B parameters, supporting image-text interaction tasks.

Rtdetr V2 R101vd

RT-DETRv2 is an improved real-time object detection model based on the DETR architecture, optimizing detection performance through innovations like selective multi-scale feature extraction and dynamic data augmentation.

Object Detection

Transformers English

Rtdetr V2 R34vd

RT-DETRv2 is an improved version of the real-time object detection Transformer model, enhancing performance through multi-scale feature extraction and optimized training strategies.

Object Detection

Transformers English

Qwen2 Audio 7B GGUF

Qwen2-Audio is a small-scale multimodal model that accepts audio and text input, enabling voice interaction without a separate speech recognition module.

Audio-to-Text English


© 2025 AIbase